Abstract

Invertible compositions of one-dimensional maps are studied which 
are assumed to include maps with non-positive Schwarzian derivative 
and others whose sum of distortions is bounded. If the assumptions 
of the Koebe principle hold, we show that the joint distortion of the 
composition is bounded. On the other hand, if all maps with possibly 
non-negative Schwarzian derivative are almost linear-fractional and 
their nonlinearities tend to cancel leaving only a small total, then 
they can all be replaced with affine maps with the same domains and 
images and the resulting composition is a very good approximation of 
the original one. 

These technical tools are then applied to prove a theorem about 
critical circle maps. 



1 Introduction 



*This is a corrected and updated version of the Stony Brook preprint "Bounded Distor- 
tion Properties of One-Dimensional Maps." Part of this work was done while the author 
was visiting the Institute for Advanced Study in Princeton. Also, work supported in part 
by NSF grant 431-3604A. 
1.1 Review of results and techniques



There are two ways of bounding the distortion of long compositions of one- 
dimensional maps which appear so typically when we consider high iterates 
of a map. 

Bounded nonlinearity. One is to use "bounded nonlinearity". The
method goes back to Denjoy. In modern times, we think of the nonlinearity
of a function $f$ on a one-dimensional manifold as a form

$$\mathcal{N}f = \frac{f''}{f'}\,dx$$

distributed along the manifold, and it turns out that the distortion of a high
n-th iterate on some interval is bounded by the integral of this form over the
sum of the images of this interval from the 0-th to the (n-1)-st. There is a nice
description of this method with many applications to be found in [?].
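The bound described above can be checked numerically. The following is a minimal sketch, assuming a hypothetical smooth map `f` (a lift of a circle diffeomorphism chosen only for illustration, not taken from the paper); it compares the distortion of the n-th iterate between two points with the sum of the integrals of $|f''/f'|$ over the successive images of the interval between them.

```python
import math

# Hypothetical example map (lift of a circle diffeomorphism); not from the paper.
def f(x):
    return x + 0.3 + 0.1 * math.sin(x)

def df(x):
    return 1.0 + 0.1 * math.cos(x)

def nonlinearity(x):
    # f''/f' for the map above
    return -0.1 * math.sin(x) / df(x)

def integral_abs_nonlinearity(lo, hi, steps=2000):
    # Midpoint rule for the integral of |f''/f'| over (lo, hi).
    h = (hi - lo) / steps
    return sum(abs(nonlinearity(lo + (k + 0.5) * h)) for k in range(steps)) * h

def distortion_and_bound(x, y, n):
    """Return (|log (f^n)'(x) - log (f^n)'(y)|, sum of integrals over the images)."""
    log_ratio, bound = 0.0, 0.0
    for _ in range(n):
        log_ratio += math.log(df(x)) - math.log(df(y))
        bound += integral_abs_nonlinearity(min(x, y), max(x, y))
        x, y = f(x), f(y)
    return abs(log_ratio), bound

distortion, bound = distortion_and_bound(0.3, 0.5, 25)
```

The inequality `distortion <= bound` holds up to quadrature error because each term of the chain rule sum is an integral of $f''/f'$ over one image of the interval.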

"Koebe principle". If the map has critical points, its nonlinearity 
is non-integrable and hence no useful estimates can be obtained using the 
above-mentioned method. In this context, a new estimate was found in recent
years. Instead of the integrable nonlinearity it uses negativity (positivity) of
the Schwarzian derivative. It was first clearly stated in [?], though it seems
that other people had had similar ideas even before. The Koebe principle
gives pretty good estimates, but the assumption of negative Schwarzian is
an uncomfortably strong one to make.

What we would like to know. There is another obvious observation, 
namely that any map has an "integrable nonlinearity" part and a "negative 
Schwarzian" part. The Schwarzian derivative must be negative in some neigh- 
borhood of each critical point, and beyond the union of these neighborhoods 
the nonlinearity is bounded. This observation was made and successfully 
used in a number of works. Estimates of the distortion were obtained, but 
they were typically estimates by large numbers. Sometimes, it is desirable to 
know also that the distortion is actually small. We give this kind of estimate 
in Section 2. 
Another problem appears in conjunction with the study of universality,
notably in the case of circle maps. It is widely believed that for circle
homeomorphisms whose rotation number is the golden mean their differentiable
conjugacy class depends only on the type of their "singularities"^1. So far this
has been proved in the situation of no singularities, when it is the famous
M. Herman's theorem (see [?], [?] and [?]), and there is a computer-assisted
local argument in the case of one cubic-type singularity [?]. Herman's theorem
implies, and is not too far from being equivalent to, that the first return maps 
on small intervals tend to be linear as the number of iterates involved grows. 
It means that the distortions acquired by consecutive iterates of the map 
tend to cancel. We used the word cancel, because their sum with absolute 
values certainly does not tend to zero. 

If we want to tackle the case when singularities do exist, we would at least 
like to know that we can asymptotically neglect the distortion coming from 
parts of the circle far from the singularities. The proof of this fact, called the 
"pure singularity property" occupies the final sections of our work. 

The emphasis of this paper is on technical problems, notably on the
methods using the "Poincare model of the interval". The possibility of such an
approach was realized earlier and commented on by D. Sullivan (see [?]) and
S. van Strien. In the present paper, new aspects and applications of this
technique are shown.

There are two main results of the paper: the Uniform Bounded Distortion 
Lemma in Section 2 and the Main Theorem in Section 4. The first result is 
a tool which I believe may be useful. The Main Theorem concerns universal 
properties of circle maps. The Uniform Bounded Distortion Lemma is not 
necessary in order to prove the Main Theorem, thus a reader who is only
interested in the Main Theorem has no need to advance beyond Lemma 2.1
in Section 2.

This paper owes its inspiration to the graduate course taught by D. Sulli- 
van in the fall of 1988. I also express my thanks to L. Jonker, A. Epstein, W. 
Paluba and M. Samra whose keen remarks allowed me to eliminate a number 
of mistakes from the manuscript. 



1 "Singularities" meaning, I guess, points where the nonlinearity is infinite or undefined.
1.2 Poincare model of the interval



We start with introducing the "Poincare model" of the interval. If we are
given an interval $(\alpha, \delta)$ we can map its interior onto the real line by the map

$$\mathcal{P}_{(\alpha,\delta)}(\gamma) := -\log\left(Cr\left(\alpha, \tfrac{\alpha+\delta}{2}, \gamma, \delta\right)\right).$$

Here, $Cr(\alpha, \beta, \gamma, \delta)$ is the cross-ratio defined by

$$Cr(\alpha, \beta, \gamma, \delta) = \frac{(\beta-\alpha)(\delta-\gamma)}{(\gamma-\alpha)(\delta-\beta)}.$$

For any interval $I$ let $A_I$ be the affine map from $I$ to $[0, 1]$. It is an easy
observation that $\mathcal{P}_I = \mathcal{P}_{[0,1]} \circ A_I$. Another useful fact is that the distance
between the images of two points $x, y \in (\alpha, \delta)$ is equal to $|\log Cr(\alpha, x, y, \delta)|$. Thus, we
may alternatively think of the Poincare model as the interval equipped with
a new metric.
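As a small illustration of the two facts just stated, the sketch below computes the Poincare coordinate and checks that the distance between the images of two points equals $|\log Cr(\alpha, x, y, \delta)|$. It assumes that the normalising second argument of the cross-ratio in the definition is the midpoint of the interval; the function names and the sample numbers are hypothetical.

```python
import math

def cross_ratio(a, b, c, d):
    """Cr(a, b, c, d) = (b - a)(d - c) / ((c - a)(d - b))."""
    return (b - a) * (d - c) / ((c - a) * (d - b))

def poincare_coord(alpha, delta, gamma):
    """Poincare coordinate of gamma in (alpha, delta); the midpoint is sent to 0 (assumption)."""
    mid = 0.5 * (alpha + delta)
    return -math.log(cross_ratio(alpha, mid, gamma, delta))

def poincare_distance(alpha, delta, x, y):
    return abs(poincare_coord(alpha, delta, x) - poincare_coord(alpha, delta, y))

alpha, delta = -1.0, 3.0
x, y = 0.2, 2.5
dist_from_coords = poincare_distance(alpha, delta, x, y)
dist_from_cr = abs(math.log(cross_ratio(alpha, x, y, delta)))
midpoint_coord = poincare_coord(alpha, delta, 1.0)
```

With this normalisation the coordinate reduces to $\log((\gamma-\alpha)/(\delta-\gamma))$, which makes the agreement of the two distances a one-line computation.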

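Whenever we have a map $\phi$ from an interval $I$ to an interval $J$ we can
consider $A_J \circ \phi \circ A_I^{-1}$ mapping from $[0, 1]$ into itself. Then we define

$$\mathcal{P}(\phi) := \mathcal{P}_{[0,1]} \circ A_J \circ \phi \circ A_I^{-1} \circ \mathcal{P}_{[0,1]}^{-1} = \mathcal{P}_J \circ \phi \circ \mathcal{P}_I^{-1}.$$

Thus we have defined the operator $\mathcal{P}$ which assigns to every map from
an interval to another interval its "Poincare model map". This definition
depends on the choice of the domain and image of the map.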

We will need to understand the action of this operator on the group 
of orientation preserving self-homeomorphisms of [0,1]. The operator then 
establishes an isomorphism with the group of orientation-preserving home- 
omorphisms of the real line. Linear-fractional maps of the interval become 
translations, maps with negative Schwarzian derivative are mapped to
expansions of the line and maps with positive Schwarzian correspond to
contractions.
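The first two statements can be observed on concrete examples. The sketch below is only an illustration under stated assumptions: it uses the coordinate $\log(x/(1-x))$ on $(0,1)$, a hypothetical linear-fractional map fixing 0 and 1, and the map $x \mapsto x^2$, whose Schwarzian derivative on $(0,1)$ is negative.

```python
import math

def to_line(x):
    """Poincare coordinate on (0, 1): log(x / (1 - x))."""
    return math.log(x / (1.0 - x))

def to_interval(u):
    """Inverse of to_line."""
    return 1.0 / (1.0 + math.exp(-u))

def model(phi):
    """Poincare model map of a self-map phi of (0, 1)."""
    return lambda u: to_line(phi(to_interval(u)))

k = 3.0
mobius = lambda x: k * x / (1.0 + (k - 1.0) * x)   # linear-fractional, fixes 0 and 1
square = lambda x: x * x                            # negative Schwarzian on (0, 1)

grid = [-4.0 + 0.5 * j for j in range(17)]          # sample points of the line
shifts = [model(mobius)(u) - u for u in grid]
steps = [model(square)(v) - model(square)(u) for u, v in zip(grid, grid[1:])]
```

Every entry of `shifts` equals $\log k$, so the model of the linear-fractional map is a translation, while every entry of `steps` is at least the grid spacing 0.5, so the model of $x^2$ does not shrink distances.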

For every map $\phi$ on an interval we define its Poincare distortion norm
(or simply distortion norm if there is no danger of confusion) $\mathcal{D}(\phi)$ by

$$\mathcal{D}(\phi) = \|\mathcal{P}(\phi) - \mathrm{id}\|_{C^0}.$$

Next, for any function $\phi$ defined on an interval $(x,y)$ we define

$$\rho(x,y;\phi) := \frac{|\phi((x,y))|}{|(x,y)|}. \tag{1}$$

There is a correspondence between these quantities: 

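Lemma 1.1 For any orientation preserving homeomorphism $\phi : (x, y) \to J$
and $\gamma \in (x,y)$ with $\mathcal{P}_{(x,y)}(\gamma) = \gamma'$

$$\log\frac{\rho(x,\gamma;\phi)}{\rho(\gamma,y;\phi)} = \mathcal{P}(\phi)(\gamma') - \gamma'.$$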

Proof: 

Since we can pre- and postcompose $\phi$ with affine maps and that will not
change the quantities we are interested in, we may assume that $x = 0$, $y = 1$,
$J = (0, 1)$. Then we simply compute

$$\log\frac{\rho(0,\gamma;\phi)}{\rho(\gamma,1;\phi)} = \log\frac{\phi(\gamma)\,(1-\gamma)}{\gamma\,(1-\phi(\gamma))}.$$

The absolute value of this quantity is the same as the Poincare distance
between $\gamma$ and $\phi(\gamma)$ and the sign is correct provided that $\phi$ preserves the
orientation. The claim follows.

□ 
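The identity of Lemma 1.1 is easy to test numerically. The sketch below assumes the coordinate $\log((p-x)/(y-p))$ on $(x,y)$ and the orientation of the quotient $\rho(x,\gamma;\phi)/\rho(\gamma,y;\phi)$; the map `phi` and the interval are hypothetical choices.

```python
import math

def rho(x, y, phi):
    """Ratio of the length of the image of (x, y) to the length of (x, y)."""
    return abs(phi(y) - phi(x)) / abs(y - x)

def coord(lo, hi, p):
    """Poincare coordinate of p in (lo, hi)."""
    return math.log((p - lo) / (hi - p))

def coord_inv(lo, hi, u):
    """Inverse of coord."""
    return lo + (hi - lo) / (1.0 + math.exp(-u))

phi = lambda t: t ** 3 + t          # hypothetical increasing homeomorphism
x, y = 0.5, 2.0
J = (phi(x), phi(y))

def model_phi(u):
    """Poincare model map of phi : (x, y) -> J."""
    return coord(J[0], J[1], phi(coord_inv(x, y, u)))

gamma = 1.1
gamma_prime = coord(x, y, gamma)
lhs = math.log(rho(x, gamma, phi) / rho(gamma, y, phi))
rhs = model_phi(gamma_prime) - gamma_prime
```

Here `lhs` and `rhs` agree to rounding error, which is the content of the lemma for this particular map.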

The meaning of the distortion norm $\mathcal{D}$. A more usual measure of
distortion by an interval diffeomorphism $f$ whose domain is $[\alpha, \delta]$ (i.e. it
extends a little beyond $(\alpha, \delta)$ as a smooth map) is

$$\sup\left\{\left|\log\frac{f'(x)}{f'(y)}\right| : x,y \in [\alpha, \delta]\right\}. \tag{2}$$

First, we notice that 

where $\gamma'$ is the image of $\gamma$ in the Poincare model. This follows immediately
from Lemma 1.1. Since the analogous statement is valid for $\delta$, we get that

Thus, the usual distortion norm given by formula (2) is bounded by twice
the supremum of the Poincare distortion norms for all restrictions of the map
to subintervals.

Fortunately, our future estimates of the Poincare distortion norm will 
have the property that they are uniformly good for all restrictions to smaller 
intervals. 

On the other hand, the norm given by formula (2) is obviously larger than
the Poincare distortion norm.



2 Uniform Bounded Distortion Lemma 

We will show an estimate quite similar to the Koebe principle, save that
we will not require the Schwarzian to be of a definite sign. All we need is 
that the function is a composition of many functions, some of which have 
non-negative Schwarzian and the joint distortion of others is bounded. This 
is what typically happens when we consider a high iterate of a function. 



2.1 The formulation 

Standard compositions. We consider a function $f$ defined on an interval
$(a, d)$ of the following form:

$$f := \sigma_m \circ h_m \circ \cdots \circ \sigma_1 \circ h_1, \qquad f_0 = \mathrm{id}. \tag{3}$$

We will also use the notation:

$$f_k := \sigma_k \circ h_k \circ \cdots \circ \sigma_1 \circ h_1, \qquad k \le n.$$

All maps are defined on intervals and are order-preserving homeomorphisms
onto the domain of the next map. Maps $\sigma_i$ are assumed to have
non-negative Schwarzian derivative.

Next, we define two distortion "norms" for any such composition. 

• The number $d_1$ is equal to^2

$$\sum_{i=1}^{m} \inf\left\{0,\ \log\frac{\rho(\alpha,\beta;h_i)\cdot\rho(\gamma,\delta;h_i)}{\rho(\alpha,\delta;h_i)\cdot\rho(\beta,\gamma;h_i)} : \alpha<\beta<\gamma<\delta \in f_{i-1}((a,d))\right\}. \tag{4}$$

Since the argument of the logarithm represents the change of a cross-ratio
under the map, the i-th term in the sum is zero provided that $h_i$
has a non-negative Schwarzian derivative.

2 Caution! $d_1$ is negative.

• The number $d_2$ is

$$d_2 := \sum_{i=1}^{m} \mathcal{D}(h_i). \tag{5}$$

The "norm" $d_1$ only gets closer to 0 if we consider the restriction to a
smaller interval. In order for $d_2$ to be uniform with respect to restrictions, it
is sufficient to demand that the sum of "log-ratio of derivatives" norms for
maps $h_i$ be bounded.
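To make the role of the sign of the Schwarzian in $d_1$ concrete, the sketch below evaluates one term of the sum over a finite sample of quadruples. It assumes that the cross-ratio whose change is measured is $(\beta-\alpha)(\delta-\gamma)/((\delta-\alpha)(\gamma-\beta))$; the sample points and the three maps are hypothetical. The map `tan` has Schwarzian derivative identically 2, `atan` has negative Schwarzian derivative, and the linear-fractional map has Schwarzian derivative 0.

```python
import math
from itertools import combinations

def rho(x, y, phi):
    return (phi(y) - phi(x)) / (y - x)

def log_cr_change(al, be, ga, de, phi):
    """Logarithm of the change of the cross-ratio under phi."""
    num = rho(al, be, phi) * rho(ga, de, phi)
    den = rho(al, de, phi) * rho(be, ga, phi)
    return math.log(num / den)

def d1_term(phi, points):
    """inf{0, log of cross-ratio change}, taken over quadruples of a finite sample only."""
    return min([0.0] + [log_cr_change(*q, phi) for q in combinations(points, 4)])

points = [-1.2, -0.7, -0.1, 0.4, 0.9, 1.3]          # increasing sample in (-pi/2, pi/2)
mobius = lambda x: (2.0 * x + 1.0) / (x + 3.0)      # hypothetical linear-fractional map

term_mobius = d1_term(mobius, points)
term_tan = d1_term(math.tan, points)
term_atan = d1_term(math.atan, points)
```

On this sample the terms for the linear-fractional map and for `tan` vanish, while the term for `atan` is strictly negative, which is the only way a map can contribute to $d_1$.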

In addition, we assume that 

$$\max\{|\mathcal{D}(h_i)| : 0 < i \le n\} < \log 2.$$

We will now see that the last requirement is only technical and can be
satisfied in each example. More precisely, if this condition initially is not
satisfied, we can write the same function $f$ as another composition for which
the norms $d_1$ and $d_2$ stay the same, but the distortion norms of functions $h_i$
become suitably small.

The way to do it is by rewriting all maps $h_i$ whose distortion is too large
as compositions of many maps already with suitably small distortion norms.
For each $\mathcal{P}(h_i)$ we define the family

$$\mathcal{P}(h_i^t) := \mathrm{id} + t(\mathcal{P}(h_i) - \mathrm{id}).$$

Obviously, $h_i^1 = h_i$ and by dividing the interval $[0, 1]$ into sufficiently many
subintervals we represent $h_i$ as a composition of maps with small distortion
norms. The norms $d_1$ and $d_2$ will stay the same.

Compositions satisfying these assumptions together with their norms as 
defined above will be called standard compositions. 

Uniform Bounded Distortion Lemma. 

The technical statement. If a function $f$ is a standard composition
of length $n$ defined on an interval $(a, d)$, then for any interval $(b, c) \subset (a, d)$,
the distortion of $f$ on $(b, c)$ is bounded, namely:

$$\mathcal{D}(f|_{(b,c)}) \le Q\, d_2 \exp(|d_1|) \min\left(1, \frac{c-b}{\min(b-a,\,d-c)}\right) + d_2 + 2\,|d_1 + \log(Cr(a,b,c,d))|$$

where $Q$ is a constant quite independent of the composition.

A simplified statement. Suppose we have a standard composition defined
on an interval $(a, d)$. Then, its Poincare distortion on an interval $(b, c)$
is bounded by

$$d_1 + d_2 + K(d_1, d_2)\,|\log(Cr(a,b,c,d))|$$

where $K(d_1, d_2)$ is a constant depending only on $d_1$ and $d_2$ in a continuous
fashion.

We leave it to the reader as an easy exercise to see that the simplified 
version follows from the technical version. 

A comment. We want to compare the classical Koebe principle with our
Uniform Bounded Distortion Lemma and other estimates. One way to state
the classical Koebe principle is this:

Koebe principle If $f$ is a diffeomorphism defined on an interval $(a, d)$,
and the Schwarzian derivative of $f$ is non-negative, then the nonlinearity
coefficient of $f$ is bounded, namely:

$$\left|\frac{f''(x)}{f'(x)}\right| \le \frac{2}{\min(x-a,\,d-x)}.$$

If we replace $\min(x-a, d-x)^{-1}$ with $1/(x-a) + 1/(d-x)$ and integrate
from $y$ to $z$, we get

$$|\log(f'(y)/f'(z))| \le 2\,|\log Cr(a,y,z,d)|.$$
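Both inequalities can be tried out on a concrete map. The sketch below is an illustration only: it takes $f = \tan$ on $(a,d) = (-\pi/2, \pi/2)$, a hypothetical choice whose Schwarzian derivative is identically 2, and checks the pointwise bound on $f''/f'$ on a grid as well as the integrated bound for one pair of points.

```python
import math

def cross_ratio(a, b, c, d):
    return (b - a) * (d - c) / ((c - a) * (d - b))

a, d = -math.pi / 2, math.pi / 2      # tan maps (a, d) diffeomorphically onto the line

def nonlinearity(x):
    # f''/f' for f = tan
    return 2.0 * math.tan(x)

def log_derivative(x):
    # log f'(x) for f = tan, i.e. log(1 / cos(x)^2)
    return -2.0 * math.log(math.cos(x))

xs = [a + (d - a) * k / 200 for k in range(1, 200)]
pointwise_ok = all(abs(nonlinearity(x)) <= 2.0 / min(x - a, d - x) + 1e-12 for x in xs)

y, z = 0.3, 1.2
lhs = abs(log_derivative(y) - log_derivative(z))
rhs = 2.0 * abs(math.log(cross_ratio(a, y, z, d)))
```

For this map the pointwise bound amounts to $\cot s \le 1/s$ with $s$ the distance to the nearer endpoint, so the check succeeds with room to spare near the middle of the interval.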

Thus, it becomes clear that the simplified version of the Uniform Bounded 
Distortion Lemma is a natural generalization of the Koebe principle. 

As such, it is slightly stronger than estimates known so far (see [?] and
[?] for examples). Those earlier estimates let us bound the distortion by a
uniform constant, while both the Koebe principle and our lemma also give
conditions for the distortion to be small (namely, $c - b$ small compared with
the distance from $\{a, d\}$ and the "h-contribution" small). I know of one
example, [11], when that makes a difference.
Finally, let us mention a strong recent result of [?] which implies that our
d\ "norm" can be controlled in terms of the Zygmund norm of U /^.f] 

2.2 Proof of the Uniform Bounded Distortion Lemma. 

The obvious approach to the proof of the Uniform Bounded Distortion Lemma 
is by reducing the situation to the Koebe principle. It will, however, require 
rearranging the order of functions $\sigma$ and $h$. That would be an easy thing to
do if both functions had the same domain:

$$\sigma \circ h = (\sigma \circ h \circ \sigma^{-1}) \circ \sigma$$

and then we could regard the function in parentheses as a new $h$. Although
the interval maps usually do not have the same domains, their Poincare
model maps are all homeomorphisms of the whole line and so we play this
trick with them.

Then, two problems will appear: first, whether after the rearrangements
the new functions $h$ will preserve the bounded distortion properties; secondly,
whether we will indeed be able to use the Koebe principle afterwards in order
to bound the distortion of Poincare models of maps $\sigma_i$.

The reshuffling procedure. The next lemma tells us what happens to the
distortion norms of maps $h_i$ when we change the order of maps as described.

Lemma 2.1 Let $f$ be a standard composition. Then we can write $\mathcal{P}(f)$ as
the composition

$$\tilde h_m \circ \ldots \circ \tilde h_1 \circ \mathcal{P}(\sigma_m) \circ \ldots \circ \mathcal{P}(\sigma_1)$$

with

$$\sum_{i=1}^{m} \sup\{|\tilde h_i(\gamma) - \gamma| : \gamma \in \mathbb{R}\} \le \sum_{i=1}^{m} \mathcal{D}(h_i).$$

Proof: 

We start with rearranging the order of maps in the composition so as to get
the contractions first.^4

3 That is, in terms of the Zygmund norm of the derivative in case when the standard
composition is an iterate of the map and intermediate images of the domain are disjoint.

4 A similar reshuffling, albeit in a different context of complex quasiconformal
extensions, was used by D. Sullivan (see [?]).

We obtain the expression of the form:

$$\tilde h_m \circ \tilde h_{m-1} \circ \ldots \circ \tilde h_1 \circ \sigma_m \circ \ldots \circ \sigma_1,$$

where $\sigma_i$ means $\mathcal{P}(\sigma_i)$ and $\tilde h_i$ is of the form

$$\omega_i \circ \mathcal{P}(h_i) \circ \omega_i^{-1}$$

with $\omega_i$ being a certain composition of the functions $\sigma_i$ and therefore being
a non-expanding map.
We compute:

$$|\tilde h_i(\gamma) - \gamma| = |\omega_i\circ\mathcal{P}(h_i)\circ\omega_i^{-1}(\gamma) - \gamma| = |\omega_i\circ\mathcal{P}(h_i)\circ\omega_i^{-1}(\gamma) - \omega_i\circ\omega_i^{-1}(\gamma)|.$$

Since $\omega_i$ is a contraction, the last expression is not greater than

$$|\mathcal{P}(h_i)\circ\omega_i^{-1}(\gamma) - \omega_i^{-1}(\gamma)| \le \mathcal{D}(h_i).$$

□ 
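The mechanism of the proof, conjugating a map close to the identity by a contraction of the line, can be watched on an example. The sketch below is a toy model under stated assumptions: `S` is a hypothetical weak contraction standing in for a composition of the maps $\mathcal{P}(\sigma_i)$, `H` is a hypothetical stand-in for some $\mathcal{P}(h_i)$ with $\sup|H - \mathrm{id}| = 0.2$, and the inverse of `S` is computed by bisection.

```python
import math

def S(x):
    # Weak contraction of the line: derivative 1 - 0.5 / cosh(x)^2 lies in [0.5, 1).
    return x - 0.5 * math.tanh(x)

def S_inv(y, iters=80):
    # Bisection; S(x) = y forces |x - y| <= 0.5.
    lo, hi = y - 0.5, y + 0.5
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if S(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def H(x):
    # Stand-in for a Poincare model map P(h); its distance from the identity is 0.2.
    return x + 0.2 * math.sin(x)

def H_tilde(y):
    # Conjugated map S o H o S^{-1}, so that S o H = H_tilde o S.
    return S(H(S_inv(y)))

grid = [-8.0 + 0.01 * k for k in range(1601)]
sup_H_tilde = max(abs(H_tilde(y) - y) for y in grid)
commute_err = max(abs(S(H(x)) - H_tilde(S(x))) for x in grid)
```

The reshuffled map `H_tilde` stays within 0.2 of the identity on the grid, as the contraction property predicts, and the two orders of composition agree up to rounding.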



We can apply Lemma 2.1 to the function $f$ restricted to the interval $(b, c)$
to find out that

$$\mathcal{D}(f|_{(b,c)}) \le \sum_{i=0}^{m-1} \mathcal{D}(h_{i+1}|_{f_i((b,c))}) + \sup\{|\mathcal{P}(\sigma_m|_{h_m\circ f_{m-1}(b,c)}) \circ \ldots \circ \mathcal{P}(\sigma_1|_{h_1(b,c)})(\gamma) - \gamma| : \gamma \in \mathbb{R}\}. \tag{6}$$

The first sum in the inequality (6) can be sufficiently sharply bounded
by $d_2$. What remains is to bound the distortion of maps with non-negative
Schwarzian in the second term. Naturally, we are going to achieve that using
the Koebe principle; however, we must be careful about the domains.

The issue of domains. We proceed as follows: A map $g_i$ is defined to
be the linear fractional map that agrees with the map $h_i$ on the points
$f_{i-1}(a)$, $f_{i-1}(b)$, $f_{i-1}(c)$. At the same time, we assume without loss of
generality that $b - a \le d - c$. We further consider maps $F$ and $F_i$ defined in
the analogous way to $f$ and $f_i$, save that the functions $h_i$ are replaced by $g_i$.
The function $F$ has non-negative Schwarzian, but in order to use the Koebe
principle we need to prove that it is defined on a sufficiently big neighborhood
of $(b, c)$. It is certainly defined on $(a, c)$ and we want to choose a point $d' > c$
so that $F$ is defined on $(a, d')$ as well. This problem is solved by the next
lemma:
Lemma 2.2 If we choose $d'$ so that

$$\frac{(d'-c)(b-a)}{(c-b)(d'-a)} = \exp(d_1)\cdot\frac{(d-c)(b-a)}{(c-b)(d-a)}$$

then $F$ is defined on $(a, d')$.



Proof: 

To check that we need to ensure that

$$g_i(F_{i-1}(d')) < h_i(f_{i-1}(d))$$

holds for every $i \le m$. To prove that we will show that

$$\frac{(g_i(F_{i-1}(d')) - g_i(F_{i-1}(c)))\cdot(g_i(F_{i-1}(b)) - g_i(F_{i-1}(a)))}{(g_i(F_{i-1}(c)) - g_i(F_{i-1}(b)))\cdot(g_i(F_{i-1}(d')) - g_i(F_{i-1}(a)))} \le \frac{(h_i(f_{i-1}(d)) - h_i(f_{i-1}(c)))\cdot(h_i(f_{i-1}(b)) - h_i(f_{i-1}(a)))}{(h_i(f_{i-1}(c)) - h_i(f_{i-1}(b)))\cdot(h_i(f_{i-1}(d)) - h_i(f_{i-1}(a)))}. \tag{7}$$

The two complicated ratios above represent values of a certain cross-ratio
on the images of points $a, b, c, d'$ and $a, b, c, d$ respectively. Whenever we apply
an order preserving homeomorphism $\phi$ to points $\alpha < \beta < \gamma < \delta' < \delta$ the
changes imposed on these cross-ratios will be related in the following way:

$$\frac{\rho(\gamma,\delta';\phi)\cdot\rho(\alpha,\beta;\phi)}{\rho(\beta,\gamma;\phi)\cdot\rho(\alpha,\delta';\phi)} : \frac{\rho(\gamma,\delta;\phi)\cdot\rho(\alpha,\beta;\phi)}{\rho(\beta,\gamma;\phi)\cdot\rho(\alpha,\delta;\phi)} = \frac{\rho(\gamma,\delta';\phi)\cdot\rho(\alpha,\delta;\phi)}{\rho(\alpha,\delta';\phi)\cdot\rho(\gamma,\delta;\phi)}.$$

The last expression is distortion of some kind of cross-ratio. Using the method
from [?] we can show that if $\phi$ has a non-negative Schwarzian, this number
is not greater than 1. That means that provided (7) holds with some $i$,
the subsequent application of $\sigma_i$ will not increase the ratio of the left-hand
side of (7) to the right-hand side. Next, we will apply $g_{i+1}$ to the points
$F_i(a)$, $F_i(b)$, $F_i(c)$, $F_i(d')$, which is not going to change the cross-ratio at all;
and $h_{i+1}$ that will be applied to the points $f_i(a)$, $f_i(b)$, $f_i(c)$, $f_i(d)$ can decrease
the cross-ratio by some factor $\zeta$. As this reasoning shows, the ratio between
the left-hand side of (7) and the right-hand side will grow by no more than $\zeta$
as we pass from $i$ to $i + 1$. Our assumptions imply that the total growth as
we pass from $i = 1$ to $i = n - 1$ will be no more than $\exp(-d_1)$.
So, the lemma follows.
□

From now on, we take $(a, d')$ as specified in Lemma 2.2 to be the default
domain of $F$.

By the Koebe principle we get the estimate for the nonlinearity coefficient
of $F$, i.e. $nF := F''/F'$.

$$|nF(\gamma)| \le \frac{2}{\min\{\gamma-a,\ d'-\gamma\}} \le \frac{2}{\gamma-a} + \frac{2}{d'-\gamma}.$$

Integrating this inequality over $(b, c)$ we get

$$\sup\{|\log(F'(\gamma)) - \log(F'(\beta))| : \gamma,\beta \in (b,c)\} \le -2\log(Cr(a,b,c,d')) = -2d_1 - 2\log(Cr(a,b,c,d)).$$

The quantity on the left-hand side of the last inequality is certainly not less
than $\mathcal{D}(F|_{(b,c)})$.
Thus,

$$\sup\{|\mathcal{P}(\sigma_m|_{h_m\circ f_{m-1}(b,c)}) \circ \ldots \circ \mathcal{P}(\sigma_1|_{h_1(b,c)})(\gamma) - \gamma| : \gamma \in \mathbb{R}\} \le -2d_1 - 2\log(Cr(a,b,c,d)) + \sum_{i=1}^{m} \mathcal{D}(g_i|_{F_{i-1}(b,c)}). \tag{8}$$

The distortion of the linear-fractional part. The first two terms on
the right-hand side of the inequality (8) can also be found in the statement of
the Uniform Bounded Distortion Lemma. What remains to be calculated is
the last sum.

Lemma 2.3 Let $g$ be a linear fractional map defined on an interval $(a,c)$,
points $b$ and $\lambda$ belong to $(a, c)$ and satisfy $b < \lambda$. Then

$$\log\rho(b,\lambda;g) - \log\rho(\lambda,c;g) = \log\rho(a,b;g) - \log\rho(a,c;g).$$

Proof:

This is a straightforward computation:

$$\log\rho(b,\lambda;g) - \log\rho(\lambda,c;g) = \log\frac{\rho(b,\lambda;g)}{\rho(a,\lambda;g)} - \log\frac{\rho(\lambda,c;g)}{\rho(a,\lambda;g)}$$

$$= \log\frac{\rho(b,\lambda;g)\cdot\rho(a,c;g)}{\rho(a,\lambda;g)\cdot\rho(b,c;g)} + \log\frac{\rho(b,c;g)}{\rho(a,c;g)} + (\mathcal{P}(g)(\mathcal{P}(\lambda)) - \mathcal{P}(\lambda)).$$

We now notice two facts: that the first term is 0, because it is the logarithm of
the distortion of some cross-ratio and $g$ is linear-fractional; next, that we can
replace $\mathcal{P}(\lambda)$ with any other point, provided $\mathcal{P}(g)$ is an isometry. Therefore,
the last expression is equal to

$$\log\frac{\rho(b,c;g)}{\rho(a,c;g)} + (\mathcal{P}(g)(\mathcal{P}(b)) - \mathcal{P}(b)) = \log\frac{\rho(b,c;g)}{\rho(a,c;g)} + \log\frac{\rho(a,b;g)}{\rho(b,c;g)} = \log\rho(a,b;g) - \log\rho(a,c;g).$$

□ 
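Lemma 2.3 says in particular that the left-hand side does not depend on $\lambda$. A quick numerical check, using a hypothetical linear-fractional map and hypothetical points chosen only for illustration:

```python
import math

def rho(x, y, g):
    return (g(y) - g(x)) / (y - x)

g = lambda x: (2.0 * x + 1.0) / (x + 3.0)   # hypothetical increasing linear-fractional map
a, b, c = 0.0, 0.5, 2.0

def lhs(lam):
    return math.log(rho(b, lam, g)) - math.log(rho(lam, c, g))

rhs = math.log(rho(a, b, g)) - math.log(rho(a, c, g))
values = [lhs(lam) for lam in (0.6, 1.0, 1.3, 1.9)]
```

All entries of `values` coincide with `rhs`. For a linear-fractional map one has $\rho(x,y;g) = \sqrt{g'(x)g'(y)}$, so both sides equal $\tfrac12\log(g'(b)/g'(c))$.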



Final estimates. We are now ready to conclude the proof. From inequality
(8) and Lemma 2.3 we get

$$\sup\{|\mathcal{P}(\sigma_m|_{h_m\circ f_{m-1}(b,c)}) \circ \ldots \circ \mathcal{P}(\sigma_1|_{h_1(b,c)})(\gamma) - \gamma| : \gamma \in \mathbb{R}\} \le -2\log(Cr(a,b,c,d)) - 2d_1 + \sum_{i=1}^{m}\left|\log\frac{\rho(F_{i-1}(a),F_{i-1}(b);g_i)}{\rho(F_{i-1}(a),F_{i-1}(c);g_i)}\right|. \tag{9}$$

We will prove a simple computational lemma that will enable us to eval- 
uate the last sum. 

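Lemma 2.4 Suppose that a function $\phi$ is defined on an interval $(a, c)$ with
$\mathcal{D}(\phi) < \log 2$ and a point $b \in (a, c)$. Then

$$\left|\log\frac{\rho(a,b;\phi)}{\rho(a,c;\phi)}\right| \le Q\cdot\mathcal{D}(\phi)\,\min\left(1, \frac{c-b}{b-a}\right)$$

where $Q$ is a uniform constant.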

Proof: 

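$$\left|\log\frac{\rho(a,c;\phi)}{\rho(a,b;\phi)}\right| = \left|\log\left(\frac{\phi(c)-\phi(a)}{\phi(b)-\phi(a)} : \frac{c-a}{b-a}\right)\right| = \left|\log\frac{1 + \frac{\phi(c)-\phi(b)}{\phi(b)-\phi(a)}}{1+\frac{c-b}{b-a}}\right| = \left|\log\frac{1 + \exp(\log\rho(b,c;\phi) - \log\rho(a,b;\phi))\cdot\frac{c-b}{b-a}}{1+\frac{c-b}{b-a}}\right|. \tag{10}$$

Our final estimate will depend on how we bound the argument of the
logarithm in Formula (10). One possible estimate is

$$\frac{1 + \exp(\mathcal{D}(\phi))\cdot\frac{c-b}{b-a}}{1 + \frac{c-b}{b-a}} \le \exp(\mathcal{D}(\phi)) \le 1 + 2\mathcal{D}(\phi),$$

where the last step uses $\mathcal{D}(\phi) < \log 2$.

If $\frac{c-b}{b-a}$ is small, we can expand formula (10) into a series, and get an estimate
proportional to $\frac{c-b}{b-a}$.

□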



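We will apply Lemma 2.4 in the situation when $\phi := g_i$, $a := F_{i-1}(a)$,
$b := F_{i-1}(b)$, and $c := F_{i-1}(c)$.

We first note that $\mathcal{D}(g_i) \le \mathcal{D}(h_i)$. Then we recall that the distortion
norms of maps $h_i$ can be assumed to be as small as we want, in particular
could be less than $\log 2$. Thus the assumptions of Lemma 2.4 are satisfied.

It enables us to bound the sum in Formula (9) by

$$Q\sum_{i=1}^{m} \mathcal{D}(h_i)\min\left(1, \frac{f_{i-1}(c)-f_{i-1}(b)}{f_{i-1}(b)-f_{i-1}(a)}\right), \tag{11}$$

where we used that $F_{i-1}$ and $f_{i-1}$ agree at the points $a$, $b$ and $c$.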



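What we need is to estimate the ratios in (11) in terms of $\frac{c-b}{b-a}$. To do that,
we consider a cross-ratio $CR(\alpha, \beta, \gamma, \delta)$ defined by

$$CR(\alpha,\beta,\gamma,\delta) := \frac{(\delta-\gamma)(\beta-\alpha)}{(\gamma-\beta)(\delta-\alpha)}.$$

Then

$$\frac{f_i(b)-f_i(a)}{f_i(c)-f_i(b)} > CR(f_i(a),f_i(b),f_i(c),f_i(d)) \ge \exp(d_1)\cdot CR(a,b,c,d)$$

$$= \exp(d_1)\cdot\frac{b-a}{c-b}\cdot\frac{d-c}{(b-a)+(d-c)+(c-b)} \ge \exp(d_1)\cdot\frac{b-a}{c-b}\cdot\frac{1}{2+\frac{c-b}{b-a}},$$

hence

$$\frac{f_i(c)-f_i(b)}{f_i(b)-f_i(a)} \le 2\exp(-d_1)\cdot\frac{c-b}{b-a}\cdot\left(1+\frac{c-b}{b-a}\right).$$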
Thus, if $\frac{c-b}{b-a} \le 1$, we get

$$\frac{f_i(c)-f_i(b)}{f_i(b)-f_i(a)} \le 4\exp(-d_1)\,\frac{c-b}{b-a}.$$

We can finally bound the expression (11) by

$$4Q\,d_2\exp(-d_1)\min\left(1, \frac{c-b}{b-a}\right),$$

which allows us to estimate the whole (9) by

$$-2\log(Cr(a,b,c,d)) - 2d_1 + 4Q\,d_2\exp(-d_1)\min\left(1,\frac{c-b}{b-a}\right). \tag{12}$$

This concludes the proof of the Uniform Bounded Distortion Lemma. 

3 Functions hi with cancelling distortions 

In our formulation of the Uniform Bounded Distortion Lemma we assumed 
that the distortion of the composition depends on the sum of distortions 
of individual maps hi. While that may often be a good estimate, it leaves 
the case when distortions of maps cancel without satisfactory solution be- 
cause it does not offer any way in which we could account for cancellations. 
For example, high iterates of critical maps of the circle can be regarded as 
compositions of the form considered by us with maps $h_i$ being nearly
linear-fractional with nonlinearity totaling to close to zero. The basic question is
whether in that situation we can ignore the maps $h_i$ completely and
approximate the whole composition $\mathcal{P}(f)$ by composition of maps $\mathcal{P}(\sigma_i)$ only. The
Cancellation Lemma formulated below is a good tool to be used in such situ- 
ations. The Lemma is a nice illustration of the power of the Poincare model 
approach. To the best of the author's knowledge, no other technique has 
yielded a similar result. 

Cancellation Lemma. Let us consider a standard composition defined on
an interval $(a, b)$

$$f = f_m = \sigma_m \circ H_m \circ \cdots \circ \sigma_1 \circ H_1.$$

We further assume that each map $H_i$ can be written as a composition

$$H_i = h_i \circ g_i$$

where $h_i$ is a linear-fractional map.
We denote

$$D := \sum_{i=1}^{m}\|\mathcal{P}(g_i) - \mathrm{id}\|_{C^0}$$

and

$$\Delta := \max\left\{\left|\sum_{i=1}^{j}(\mathcal{P}(h_i) - \mathrm{id})\right| : 1 \le j \le m\right\},$$

$$S_m := \mathcal{P}(\sigma_m)\circ\cdots\circ\mathcal{P}(\sigma_1).$$

Then,

$$\|\mathcal{P}(f) - S_m\|_{C^0} \le D + 2\Delta.$$

A comment. What the Cancellation Lemma tells us is that if we have a
composition where maps of not necessarily positive Schwarzian are all almost
linear-fractional and their distortions almost cancel, then we can replace these
maps with affine maps and still get a good approximation of the composition,
at least locally. The main value of this lemma is that its assumptions are
verified for first return maps of circle homeomorphisms as we prove in the
next section.
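The effect of cancellation can be seen in a toy model on the real line. The sketch below is an illustration under stated assumptions, not the setting of the paper: the Poincare models of the maps $\sigma_i$ are replaced by hypothetical weak contractions, the models of the linear-fractional maps by translations with alternating signs, and the maps $g_i$ are taken to be the identity, so that $D = 0$.

```python
import math

def contraction(shift):
    # Increasing map of the line with derivative in [0.5, 1]: a weak contraction.
    return lambda x: x - 0.5 * math.tanh(x - shift)

m = 40
deltas = [0.3 if i % 2 == 0 else -0.3 for i in range(m)]   # translations that cancel in pairs
sigmas = [contraction(0.1 * i - 2.0) for i in range(m)]

def full(x):
    # sigma_m o tau_m o ... o sigma_1 o tau_1, with tau_i the translation by deltas[i]
    for d, s in zip(deltas, sigmas):
        x = s(x + d)
    return x

def sigma_only(x):
    # The same composition with all translations dropped.
    for s in sigmas:
        x = s(x)
    return x

partial, Delta = 0.0, 0.0
for d in deltas:
    partial += d
    Delta = max(Delta, abs(partial))       # largest partial sum in absolute value

grid = [-10.0 + 0.05 * k for k in range(401)]
worst = max(abs(full(x) - sigma_only(x)) for x in grid)
naive = sum(abs(d) for d in deltas)        # what a bound by the sum of distortions would give
```

Here `worst` stays below `2 * Delta = 0.6`, while the bound obtained by adding up the absolute values of the translations is `naive = 12`.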

Beginning of the proof. The proof of the Cancellation Lemma is not 
very easy and will occupy the next section. Here, we just make the first step 
which is the elimination of maps $g_i$ from the problem.

To achieve this, we put together maps $h_i$ and $\sigma_i$ and get $s_i := \sigma_i \circ h_i$.
Then $f$ is a standard composition of the form

$$f = s_m \circ g_m \circ \cdots \circ s_1 \circ g_1.$$

To this standard composition we apply Lemma 2.1 and what we get is
that

$$\mathcal{P}(f) = G \circ \mathcal{P}(s_m)\circ\ldots\circ\mathcal{P}(s_1)$$

in which

$$\|G - \mathrm{id}\|_{C^0} \le D.$$
The reduced Cancellation Lemma. As the preceding argument shows,
the functions $g_i$ can be omitted from the composition defining $f$ and if we
then prove that

$$\|\mathcal{P}(f) - S_m\|_{C^0} \le 2\Delta$$

this will immediately imply the Cancellation Lemma. So we make this
additional assumption and call the resulting auxiliary theorem the "reduced"
Cancellation Lemma.

3.1 Proof of the reduced Lemma. 

Extending functions $h_i$. The standard composition $f$ is now written as

$$f = f_m = \sigma_m \circ h_m \circ \cdots \circ \sigma_1 \circ h_1$$

with all maps $h_i$ linear-fractional.

We then extend functions $h_i$ to one-parameter families $h_i^t$ defined by

$$\mathcal{P}(h_i^t)(x) = x + t\cdot(\mathcal{P}(h_i)(x) - x), \qquad 0 \le t \le 1.$$

In particular, $h_i^0 = \mathrm{id}$ and $h_i^1 = h_i$. The maps like $f_i^t$ are then defined in
the obvious way.

More important notations. To simplify our future equations we define:

$$\Omega_i^t(x) := \mathcal{P}(h_i^t)\circ\mathcal{P}(f_{i-1}^t)(x) - \mathcal{P}(\sigma_{i-1})\circ\ldots\circ\mathcal{P}(\sigma_1)(x) \tag{13}$$

and

$$\delta_i := \mathcal{P}(h_i)(x) - x \tag{14}$$

which quantity is independent of the choice of $x$ provided $h_i$ is linear-fractional.

3.2 Estimates 

With these notations we are ready to prove our basic lemma. 
Lemma 3.1

$$\left|\frac{d\Omega_i^t}{dt}(x)\right| \le 2\Delta$$

for $1 \le i \le m$ and any $x$.
Proof:

We compute the derivative in question:

$$\frac{d\Omega_i^t}{dt}(x) = \sum_{j=1}^{i}\left(\prod_{l=j}^{i-1}\frac{d\mathcal{P}(\sigma_l)}{dx}\Big|_{\mathcal{P}(h_l^t\circ f_{l-1}^t)(x)}\right)\frac{d\mathcal{P}(h_j^t)}{dt}. \tag{15}$$

Here, we used the fact that the maps $\mathcal{P}(h_j^t)$ are all isometries, thus their
derivatives can be skipped in the product. For simplicity, we will denote

$$a_j(t) := \prod_{l=j}^{i-1}\frac{d\mathcal{P}(\sigma_l)}{dx}\Big|_{\mathcal{P}(h_l^t\circ f_{l-1}^t)(x)}.$$

Since maps $\sigma_j$ were assumed to have non-negative Schwarzian, their Poincare
models are weak contractions; thus $a_{j+1}(t) \ge a_j(t)$.
Formula (15) can then be rewritten as

$$\frac{d\Omega_i^t}{dt} = \sum_{j=1}^{i} a_j\delta_j.$$

Now we use the famous Abel's series transformation to bound this quantity:

$$\sum_{j=1}^{i} a_j\delta_j = \sum_{j=1}^{i} a_j\left(\sum_{k=0}^{j}\delta_k - \sum_{k=0}^{j-1}\delta_k\right)$$

where we adopted the convention $\delta_0 = 0$.
This can be further rewritten as

$$\sum_{j=1}^{i-1}(a_j - a_{j+1})\sum_{k=0}^{j}\delta_k + a_i\sum_{k=0}^{i}\delta_k.$$

The absolute value of the last expression is easy to bound. The last term
does not exceed $\Delta$ and the first one can be bounded by

$$\sum_{j=1}^{i-1}|a_j - a_{j+1}|\,\Delta \le \Delta$$

as the numbers $a_j$ form a non-decreasing sequence and are bounded by 1.
This concludes the proof.

□ 
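The summation-by-parts step and the resulting bound can be checked on numbers. In the sketch below the sequences `a` and `delta` are hypothetical samples: `a` is non-decreasing and bounded by 1, as in the proof, and `delta` has small partial sums.

```python
def abel_sides(a, delta):
    """Both sides of Abel's identity for sum_j a_j * delta_j, and the largest |partial sum|."""
    n = len(a)
    partial = [0.0]
    for dlt in delta:
        partial.append(partial[-1] + dlt)      # partial[j] = delta_1 + ... + delta_j
    direct = sum(a[j] * delta[j] for j in range(n))
    by_parts = sum((a[j] - a[j + 1]) * partial[j + 1] for j in range(n - 1)) + a[-1] * partial[n]
    return direct, by_parts, max(abs(p) for p in partial)

a = [0.1, 0.25, 0.4, 0.4, 0.7, 1.0]            # non-decreasing, bounded by 1
delta = [0.3, -0.3, 0.2, -0.25, 0.05, 0.1]     # increments whose partial sums stay small
direct, by_parts, Delta = abel_sides(a, delta)
```

The two evaluations agree, and the sum is bounded by `2 * Delta` even though the individual increments add up in absolute value to much more.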
The conclusion. Lemma 3.1 allows us to bound $\sup\{|\Omega_m^1(x)| : x \in \mathbb{R}\}$ by
$2\Delta$, but this is exactly the statement of the reduced Cancellation Lemma.

4 Pure Singularity Property 

A famous theorem of M. Herman says that smooth diffeomorphisms of the
circle are smoothly conjugated to rigid rotations, provided some diophantine-type
conditions are verified. One way to look at the smooth conjugacy is that
in a small scale it becomes $C^1$-close to linear. There is a dynamically defined
event which takes place in a small scale. This is the first return map to a 
small interval. Hence, the smooth conjugacy tells us that first return maps for 
any diffeomorphism will tend to the first return maps for the rigid rotation, 
that is, to linear maps. 

The first return map is an "induced map", which means that piecewise 
it is a high iterate of the initial map. It is also known that all intermediate 
images of the pieces of its domain are disjoint and cover the circle com- 
pletely. On each piece, the distortion is the total of distortions acquired by 
consecutive iterates on corresponding intermediate images. Precisely, we can 
consider nonlinearity defined in the introduction and then it turns out that 
the nonlinearity of the first return map on each piece of its domain is equal to 
the sum of nonlinearities in all intermediate images transported by iterates 
of the map. 

Since the linear map is characterized by $\mathcal{N}f = 0$, in Herman's theorem the
distortions must cancel. This is not so surprising, perhaps, since the integral
of the nonlinearity over the whole circle is 0. But it is a remarkable fact and
in certain approaches the central issue in the proof of Herman's theorem.

Naturally, we would like to know to what extent this fact is true for critical
circle maps. For simplicity, we will consider maps with only one critical point
such that coordinates can be changed $C^3$ smoothly to make it locally the map
$x \to x^3 + \epsilon$. It is widely conjectured, and in a few very restricted cases has
been argued with a computer's assistance, that if we choose the domains of
the first return map suitably, the sequence of the first return maps will also
approach a unique limit. But this limit map is anything but linear: its
distortion does not vanish and neither does its Schwarzian derivative.

Nevertheless, we prove that, in a certain sense, only distortions acquired 
in an immediate proximity of the singularity count. Distortions due to the 
part remote from the singularity will tend to cancel, just like in the
diffeomorphisms' case.

The proof that distortions cancel that we give is somewhat similar to the
proof of Herman's Theorem given in [?].

4.1 Assumptions and the statement of results 

The class of maps we are working with. We will consider
orientation-preserving $C^3$-smooth circle homeomorphisms with one critical point of the
polynomial type, at this moment of any rotation number.

As a consequence (see [7]), the circle is covered by two overlapping open
arcs. There is a "remote" arc on which we assume that the first derivative
is bounded away from 0. On the other "close" arc the map has non-positive
Schwarzian derivative.

We reserve the notation $f$ for maps in this class.

These assumptions are a little bit stronger than necessary for our esti- 
mates to work, but we prefer not to obscure the idea by technicalities at this 
point. We will discuss weakening of our requirements in the course of the 
paper. 

Some terminology. A symmetric neighborhood is a neighborhood of 
the critical point which is contained in the close arc and the derivative of the 
function is the same in both endpoints. It follows from our assumptions that 
a symmetric neighborhood is also almost symmetric in the ordinary sense - 
the critical point is in a bounded Poincare distance from the midpoint.

A chain of intervals is a sequence of intervals such that each is mapped 
onto the next by the map. We will be particularly interested in chains of 
disjoint intervals. Obviously, there always is a map associated with a chain, 
namely the composition leading from the first interval to the last one. 

The continued fraction approximants of the rotation number will be denoted
by $p_n / q_n$.

The denominators $q_n$ are important from the dynamical point of view, since
they determine the times of closest returns of the orbit of a point to the
point itself. The numbers satisfy the relations

$$q_{-1} = 0 , \quad q_0 = 1 , \quad q_{n+1} = a_n q_n + q_{n-1} ,$$

where the coefficients $a_n$ are defined by the continued fraction expansion of
the rotation number. An elementary discussion of the topological dynamics
of diffeomorphisms with an irrational rotation number can be found in |J . 
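
As a small illustration of the recurrence for the denominators, here is a minimal Python sketch. The function name `denominators` and the sample coefficient lists are our own illustrative choices, and the initial values $q_{-1} = 0$, $q_0 = 1$ are assumed as in the relations above.

```python
def denominators(coefficients):
    """Return [q_0, q_1, ...] computed from the recurrence
    q_{-1} = 0, q_0 = 1, q_{n+1} = a_n * q_n + q_{n-1},
    where coefficients = [a_0, a_1, ...]."""
    q_prev, q = 0, 1
    result = [q]
    for a in coefficients:
        q_prev, q = q, a * q + q_prev
        result.append(q)
    return result


# Golden-mean rotation number: every coefficient equals 1,
# so the denominators are Fibonacci numbers.
golden = denominators([1] * 8)
```

For the golden mean the closest-return times grow as slowly as possible, which makes it a convenient test case.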
An interval $J$ is said to be of the $j$-th order of fineness if

$$j = \max\{\, i : \forall x \in J ,\ f^{q_i}(x) \notin J \,\} + 1 .$$

A uniform constant is a function on our class of maps which continuously
depends only on the quasisymmetric norm of the map, the logarithm
of the size of the close arc, the lower bound of the derivative on the remote
arc, and the $C^3$ norm.

In view of this definition, "absolute" constants like $e^{\pi}$ are uniform.
Perhaps a more meaningful example is the statement:

Let $f$ be a smooth circle homeomorphism with an irrational diophantine
rotation number. For each natural $n$, the derivative of $f^n$ is bounded by a
uniform constant.

Without the word "uniform" the statement would be obviously true. As
it stands, the main problem is whether the bound depends on $n$. If $f$ is a
diffeomorphism, it does not, and the sentence remains true.^5 If $f$ is a critical
map, the statement is false: the derivatives must become very large as $n$
grows.^6

A notational convention. There will be so many uniform constants
in use in the discussion that follows that we feel a need for a special
notational convention to handle them effectively. Notations like $K_{\cdot}$ will
be used exclusively for uniform constants. The subscript will identify the
particular constant. All uniform constants will be introduced in lemmas,
propositions or facts. The rule is that in the statement in which a constant
is first defined, as well as in its proof, the constant will be identified by
a single numerical subscript. The same subscript may denote different
constants in different lemmas.

However, when we use the constant later, its single subscript will be
followed by an indication of where it was introduced. For example, the
constant $K_1$ introduced in Lemma 10.15 will be called $K_1$ in the proof of
Lemma 10.15, but later will be referred to as $K_{1,L.10.15}$.

5 Which follows from Herman's theorem.

6 Otherwise, the map would be smoothly conjugated to the rotation, see [10].

Approximate maps. To describe the procedure of approximating maps
we need two objects: a neighborhood^7 of the critical point and a chain of
intervals. To approximate the composition associated with this chain we do
the following:

• If an interval is contained in the neighborhood, we leave the map defined
on it to be $f$.

• Otherwise, instead of $f$ we use an affine map with the same image.
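
The two-step procedure just described can be sketched in Python. This is only an illustration under stated assumptions: the lift of the Arnold critical circle map stands in for a map of our class, the neighborhood is taken to be $(-\delta, \delta)$ around the critical point 0, and all function names are hypothetical.

```python
import math


def critical_circle_map(x, omega=0.3):
    """Lift of the Arnold critical circle map (assumed example map);
    its only critical point on the circle is 0."""
    return x + omega - math.sin(2 * math.pi * x) / (2 * math.pi)


def inside_neighborhood(a, b, delta):
    """True if the lifted interval (a, b) projects into (-delta, delta)."""
    shift = round((a + b) / 2)  # nearest lift of the critical point
    return -delta <= a - shift and b - shift <= delta


def approximate_composition(f, a0, b0, m, delta):
    """Build the approximate map for f^m on (a0, b0).

    Each step keeps f when the current interval of the chain lies in the
    neighborhood; otherwise it uses the affine map with the same domain
    and the same image. Returns the approximate map and the last interval."""
    steps = []
    a, b = a0, b0
    for _ in range(m):
        fa, fb = f(a), f(b)
        if inside_neighborhood(a, b, delta):
            steps.append(f)
        else:
            steps.append(
                lambda x, a=a, b=b, fa=fa, fb=fb:
                fa + (x - a) * (fb - fa) / (b - a)
            )
        a, b = fa, fb

    def phi(x):
        for step in steps:
            x = step(x)
        return x

    return phi, (a, b)


phi, (a_m, b_m) = approximate_composition(critical_circle_map, 0.40, 0.41, 5, 0.1)
```

By construction the approximate map agrees with the true composition at the endpoints of the first interval; what the Main Theorem is concerned with is how far the two maps differ in between.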

Main Theorem. Let us suppose we have a chain of intervals

$$(a_0, b_0), (a_1, b_1), \dots, (a_m, b_m) ,$$

none of which contains the critical point, of the $\kappa$-th order of fineness, and a
symmetric neighborhood $U$ with fineness of order $\lambda$. This also assumes
that the length of the continued fraction expansion of the rotation number is
at least $\kappa$. We then approximate $f^m$ on $(a_0, b_0)$ to get some $\varphi$. The result is
that:

If $\kappa > \lambda$ then:

with $K_1$ and $K_2$ being uniform constants depending only on global distortion
properties of $f$ and $K_2 < 1$.

A comment. The Main Theorem proves what we want to call informally
"the pure singularity property". If we have enough smoothness we
can change coordinates so that the resulting map is in our class and, moreover,
near its critical point it has the form $x \to x^{\beta} + f(0)$. The Main Theorem then
asserts that asymptotically only what happens in this small neighborhood
matters, and that is why the expression "pure singularity property" seems
appropriate to the author.

4.2 General strategy of the proof 

The proof will largely use the concept of the Poincaré model of the interval,
which is explained in earlier sections of this paper. We will also use the
same notations in this section. The main tool of our proof is going to be the
Cancellation Lemma.

7 usually symmetric

An outline of the argument. There is an obvious way we can use the
Cancellation Lemma to prove our Main Theorem. Maps $\sigma_j$ will be the iterates
of $f^{-1}$ which are left unchanged when we replace $f^{-m}$ by its approximate
map. Then, the claim of the Cancellation Lemma is exactly what we assert in
the theorem. Hence, our effort will be aimed towards verifying the hypotheses
of the Cancellation Lemma, and then the main technical problem will be to
bound A. We address these issues in the next section.

5 Proof of the Main Theorem. 

We assume that the assumptions of the main theorem hold; in particular 
that we are given a chain of disjoint intervals. 

5.1 Bounded distortion of critical circle maps 

Dynamical partitions. The forward orbit of the critical point defines
a sequence of partitions of the circle, called dynamical partitions. For any $k$
less than the length of the continued fraction representation of the rotation
number, the $0, \dots, q_k - 1$ images of the interval $(0, f^{q_{k-1}}(0))$ are called
lengthy intervals. The $0, \dots, q_{k-1} - 1$ images of $(f^{q_k}(0), 0)$ are called short
intervals. Together, lengthy and short intervals form a partition of the circle,
and this is exactly what we are going to call the dynamical partition of the
$k$-th order, denoted $D_k$.
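
The combinatorics of dynamical partitions can be checked numerically in the simplest model case. The sketch below uses a rigid rotation by the golden mean, which is a diffeomorphism rather than a critical map, so it only illustrates the counting: cutting the circle at the first $q_k + q_{k-1}$ points of the orbit of 0 should give $q_k$ arcs of one length and $q_{k-1}$ arcs of the other. The function names and the tolerance are our own assumptions.

```python
import math


def circle_distance(t):
    """Distance from t to the nearest integer."""
    return abs(t - round(t))


def partition_gap_counts(rho, q_k, q_km1, tol=1e-9):
    """Cut the circle at the first q_k + q_km1 points of the orbit of 0
    under x -> x + rho (mod 1); count lengthy and short arcs."""
    points = sorted((i * rho) % 1.0 for i in range(q_k + q_km1))
    gaps = [b - a for a, b in zip(points, points[1:])]
    gaps.append(1.0 - points[-1] + points[0])   # arc closing up the circle
    lengthy_len = circle_distance(q_km1 * rho)  # length of (0, f^{q_{k-1}}(0))
    short_len = circle_distance(q_k * rho)      # length of (f^{q_k}(0), 0)
    lengthy = sum(1 for g in gaps if abs(g - lengthy_len) < tol)
    short = sum(1 for g in gaps if abs(g - short_len) < tol)
    return lengthy, short


rho = (math.sqrt(5) - 1) / 2       # golden mean, all coefficients equal to 1
q = [1, 1, 2, 3, 5, 8, 13, 21]     # q_0, q_1, ... for this rotation number
counts = [partition_gap_counts(rho, q[k], q[k - 1]) for k in range(2, 8)]
```

For a critical circle map the same counting holds, but the arc lengths are no longer given by a closed formula; that is where bounded geometry enters.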

Consecutive dynamical partitions as refinements. We will now
examine a very simple correspondence between dynamical partitions of order
$k$ and $k + 1$. The latter clearly is a refinement of the former. More precisely,
all short intervals of the partition of order $k$ will become lengthy intervals of
the next partition, while the lengthy intervals of the coarser partition will be
subdivided. Each lengthy element of the partition of order $k$ will be split into
a number of lengthy intervals as well as one short interval, which all belong
to the partition of order $k + 1$.

Bounded geometry of dynamical partitions. The properties which
are commonly referred to as "bounded geometry" are summarized by the
following statement:

Fact 5.1 If $f$ satisfies our regularity conditions, then:

• The ratio of lengths of two adjacent elements of any dynamical partition
is bounded by a uniform constant $K_1$.

• For any element of any dynamical partition, the ratios of its length
to the lengths of the extreme intervals of the next partition subdividing it are
bounded by a constant $K_2$.

Unfortunately, this Fact belongs to the "folk wisdom" and there is no
clear reference for the proof. In the particular case when the orbit of the
critical point is periodic, Fact 5.1 was verified in [7]. Then, M. Herman
showed how to carry this sort of estimate over to the more general situation
(see @.) 

Lemma 5.1 There is a uniform constant $K_1$ such that the elements of $D_{\lambda + K_1}$
adjacent to 0 are contained in $U$.^8

Proof:

This follows immediately from Fact 5.1 if we take into account that $U$ is
symmetric.



Coarseness of dynamical partitions. We will need the fact that $f^{-1}$ on
elements of $D_j$ with $j$ much larger than $\lambda$ is almost linear-fractional with
almost constant nonlinearity. To prove this, it is crucial to know that the
partition $D_j$ is fine enough. Establishing this fact is the purpose of the
following three lemmas.

Definition 5.1 Given a set $V$, the coarseness of the partition $D_j$ outside
$V$, denoted $c_j(V)$, is defined by:

$$c_j(V) = \sum_{I \in D_j ,\ I \not\subset V} \left( \frac{|I|}{\operatorname{dist}(I, 0)} \right)^2 .$$



8 We remind the reader that $U$ is fixed and defined in the statement of the Main
Theorem.

Lemma 5.2 Let us fix $j$ and let $V$ be the interior of the union of the two
elements of $D_j$ adjacent to 0. Then,

$$c_{j+1}(V) < K_1 c_j(V)$$

where $K_1$ is a uniform constant less than 1.

Proof:

This is an immediate corollary to Fact 5.1 and the definition of coarseness.

□



Lemma 5.3 Let $V$ and $j$ be related as in the statement of Lemma 5.2. Then
$c_j(V)$ is uniformly bounded by some $K_1$.

Proof:

Let us define $V'$ to be the union of the elements of $D_{j+1}$ adjacent to 0.
The first observation is that $c_{j+1}((S^1 \setminus V) \cup V')$ is uniformly bounded as a
consequence of Fact 5.1, second part. This, in conjunction with Lemma 5.2,
implies that $c_{j+1}(V')$ can be bounded recursively by $c_j(V)$ times a constant
less than 1, increased by a bounded amount. This recursive bound implies a
bound uniform in $j$.

□
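
For the reader's convenience, the last step of the proof of Lemma 5.3 can be spelled out as follows. This is a sketch under the assumption that the recursive bound has the form $c_{j+1} \le K c_j + B$ with $0 < K < 1$ and $B \ge 0$; the names $K$, $B$ and $j_0$ are ours and are not used elsewhere in the paper.

```latex
c_{j+1} \le K\, c_j + B
\;\Longrightarrow\;
c_{j_0+m} \le K^m c_{j_0} + B\,\bigl(1 + K + \dots + K^{m-1}\bigr)
\le c_{j_0} + \frac{B}{1-K} .
```

The right-hand side does not depend on $m$, which is the uniformity in $j$ claimed in the lemma.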



Lemma 5.4 For $j > \lambda + K_{1,L.5.1}$,

$$c_j(U) < K_1 K_2^{\,j - \lambda}$$

with $K_2 < 1$.

Proof:

This follows immediately from Lemmas 5.1, 5.2 and 5.3.



□

5.2 Preparations to use the Cancellation Lemma.

How to represent $f^{-m}$ in the Cancellation Lemma? We choose maps
$\sigma_j$ to be the iterates of $f^{-1}$ on intervals contained in $f(U)$, and the others
will be $H_i$. To obtain a composition in the form postulated by the hypotheses
of the Cancellation Lemma we may also have to insert maps $\sigma_i$ equal to
identities between consecutive maps $H_i$.

Further choices. The problem which still remains is a judicious choice
of the maps $h_i$. We will simply give a prescription: we look at the nonlinearity
of $H_i$, and $h_i$ will be the homography that maps the domain of $H_i$ onto its
image and satisfies

$$\mathcal{P}(h_i)(x) = x - \frac{1}{2} \int_{Dm(H_i)} \mathcal{N}(H_i) .$$

The assumptions of the Cancellation Lemma are then satisfied.

The next thing we need is to estimate the constants which appear in the
Cancellation Lemma. It is relatively easy to deal with $D$, and we are going to
consider it first.

We introduce a map $g_i'$ as the map with the same image and preimage as
$H_i$, the same nonlinearity integral, and constant nonlinearity. First, we will
estimate the Poincaré discrepancy between $H_i$ and $g_i'$.

Lemma 5.5 The difference

$$|\mathcal{P}(H_i) - \mathcal{P}(g_i')|$$

is uniformly bounded by

$$K_1 \left( \frac{|(a_i, b_i)|}{\operatorname{dist}((a_i, b_i), 0)} \right)^2 .$$



Proof:

First we precompose the maps with affine functions so as to have them
defined on the unit interval. We observe that the nonlinearity coefficients of
the rescaled maps are of the order of

$$\frac{|(a_i, b_i)|}{\operatorname{dist}((a_i, b_i), 0)} .$$

With a slight abuse of notation we will still use the same symbols to denote
the rescaled maps.

Let $\cdot^*$ denote the transport of forms by functions. We compute

$$\mathcal{N}(H_i^{-1} \circ g_i') = (g_i')^*(\mathcal{N}(H_i^{-1})) + \mathcal{N}(g_i') - (g_i')^*(\mathcal{N}((g_i')^{-1})) + (g_i')^*(\mathcal{N}((g_i')^{-1}))$$

$$= (g_i')^*(\mathcal{N}(H_i^{-1})) - (g_i')^*(\mathcal{N}((g_i')^{-1})) .$$

Hence, it is enough to estimate the coefficient of
$\mathcal{N}(H_i^{-1}) - \mathcal{N}((g_i')^{-1})$, which up to sign equals

$$(H_i^{-1})^* \mathcal{N}H_i - ((g_i')^{-1})^* \mathcal{N}g_i' .$$

Finally, introducing the derivatives explicitly we get:

$$(H_i^{-1})' \, \mathcal{N}H_i - ((g_i')^{-1})' \, \mathcal{N}g_i' .$$

But it is evident that the estimate we want follows. Since the coefficients
of nonlinearities are bounded as noted at the beginning of the proof, the
derivatives are

$$1 + O\left( \frac{|(a_i, b_i)|}{\operatorname{dist}((a_i, b_i), 0)} \right) .$$

The lemma follows.

□

Lemma 5.6 The difference

$$|\mathcal{P}(g_i') - \mathcal{P}(h_i)|$$

is uniformly bounded by

$$K_1 \left( \frac{|(a_i, b_i)|}{\operatorname{dist}((a_i, b_i), 0)} \right)^2 .$$

Proof:

By solving the corresponding differential equation, we find that a function
with constant nonlinearity from the interval $[0, 1]$ to itself is given by

$$x \to \frac{\exp(nx) - 1}{\exp n - 1}$$

where $n$ is the nonlinearity coefficient.

Next, we are going to find the displacement in the Poincaré model for the
image of any point $x \in (0, 1)$. We will discard terms quadratic or higher in
$n$.

If $x'$ means the image of $x$ we get

$$x' = x + \frac{1}{2} n x^2 - \frac{1}{2} n x + O(n^2) = x \left( 1 + \frac{1}{2} n x - \frac{1}{2} n \right) + O(n^2)$$

and

$$1 - x' = 1 - x - \frac{1}{2} n x^2 + \frac{1}{2} n x + O(n^2) = (1 - x) \left( 1 + \frac{1}{2} n x \right) + O(n^2) .$$

The Poincaré displacement is

$$- \log \frac{x (1 - x')}{(1 - x) x'} = - \log \left( \frac{1 + \frac{1}{2} n x}{1 + \frac{1}{2} n x - \frac{1}{2} n} + O(n^2) \right)$$

$$= \log \left( 1 + \frac{1}{2} n x - \frac{1}{2} n - \frac{1}{2} n x + O(n^2) \right) = - \frac{1}{2} n + O(n^2) .$$

Since $n$ is of the order of

$$\frac{|(a_i, b_i)|}{\operatorname{dist}((a_i, b_i), 0)}$$

this concludes the proof.

□
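
The computation in the proof of Lemma 5.6 is easy to check numerically. The following Python sketch evaluates the constant-nonlinearity map, estimates its nonlinearity $g''/g'$ by finite differences, and compares the Poincaré displacement with $-n/2$. The function names and the step size `h` are our own choices and are not taken from the paper.

```python
import math


def constant_nonlinearity_map(x, n):
    """The map x -> (exp(n x) - 1) / (exp(n) - 1) of [0, 1] onto itself."""
    return math.expm1(n * x) / math.expm1(n)


def nonlinearity(x, n, h=1e-3):
    """Finite-difference estimate of g''(x) / g'(x) for the map above."""
    g = lambda t: constant_nonlinearity_map(t, n)
    first = (g(x + h) - g(x - h)) / (2 * h)
    second = (g(x + h) - 2 * g(x) + g(x - h)) / h ** 2
    return second / first


def poincare_displacement(x, n):
    """-log( x (1 - x') / ((1 - x) x') ), where x' is the image of x."""
    image = constant_nonlinearity_map(x, n)
    return -math.log(x * (1 - image) / ((1 - x) * image))


n = 0.01
displacements = [poincare_displacement(x, n) for x in (0.2, 0.5, 0.8)]
```

At $x = 1/2$ the displacement equals $-n/2$ exactly; at other points the difference is of second order in $n$, in agreement with the estimate above.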



Our efforts are crowned by the following proposition:

Proposition 1 If a chain of disjoint intervals satisfies the assumptions of
the Main Theorem, then our choice of maps $\sigma_i$ and $h_i$ gives the bound

$$D < K_1 K_2^{\kappa - \lambda}$$

with $K_2 < 1$.

Proof:

This follows immediately from Lemmas 5.5, 5.6 and 5.4.

□

A technical comment. We could have done our estimates separately
on the close arc and the remote arc. While the author does not see any better
method on the close arc, on the remote arc it would be enough to assume
that the second derivative is only Hölder continuous.

How to bound A? Our main remaining problem is an estimate for A.
We have to be able to see that this actually tends to 0. By the definition of

where $C_j$ is a union of those intervals of the chain from 0 to some $i_j$ that are
not contained in $U$.

In order to prove the Main Theorem we need to show that this integral is
uniformly exponentially small in $\sqrt{\kappa - \lambda}$. We call this the main estimate.

5.3 Main estimate 

Beginning of the proof of the main estimate. We consider the set $\tilde{U}$
which is equal to the complement of $U$ together with the intervals of the chain
that are partly contained in $U$. We regard $\tilde{U}$ with the suitably normalized
Lebesgue measure as a probability space. Then, the characteristic function $\chi$
of the chain can be viewed as a random variable, the dynamical partitions are
$\sigma$-algebras of events, and individual elements of those partitions are events.

We will consistently use the language of conditional expectations, denoted
by $E(\chi|\cdot)$, where the dot can be either a partition or a set (event).
The intuitive interpretation of $E(\chi|D_j)$, for example, is the function whose
value on each element of the partition is equal to the relative measure of the
chain on this element.
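
To make the probabilistic language concrete, here is a minimal Python sketch of a conditional expectation with respect to a finite partition. Intervals are represented as pairs of endpoints; the partition and the chain in the example are made-up data and the function names are hypothetical.

```python
def overlap(a, b, c, d):
    """Length of the intersection of the intervals (a, b) and (c, d)."""
    return max(0.0, min(b, d) - max(a, c))


def density(partition, chain):
    """Values of E(chi | partition): for each partition element, the
    relative measure of the chain on that element."""
    return [
        sum(overlap(a, b, c, d) for c, d in chain) / (b - a)
        for a, b in partition
    ]


def expectation(partition, values):
    """Integral of a function that is constant on partition elements."""
    return sum(v * (b - a) for (a, b), v in zip(partition, values))


partition = [(0.0, 0.5), (0.5, 1.0)]
chain = [(0.1, 0.2), (0.6, 0.9)]
densities = density(partition, chain)
```

Averaging the densities over the partition recovers the total measure of the chain, which is the identity $E(E(\chi|D_j)) = E(\chi)$ used implicitly throughout.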

The main estimate will follow from this proposition: 

Proposition 2 



for any $x$ in an element of $D_j$ completely disjoint from $U$, where $\lambda < j < \kappa$
and $K_2 < 1$, both $K_1$ and $K_2$ being uniform constants.^9

9 We also remind the reader that $\kappa$ is the fineness of the chain.




(16)

The proof of this proposition is lengthy and will occupy the next several
pages. This is the main technical part of the proof of the main estimate, and
therefore of the Main Theorem as well.

The strategy of the proof. Let us fix a number $j$ as indicated in
Proposition 2. In most arguments we will have to distinguish between two cases.

If the chain contains only a few intervals, then both $E(\chi)$ and $E(\chi|D_j)$
for $j$ larger than $\lambda$ are small. In particular, we will assume without loss of
generality that $E(\chi|D_j) > 0$. Otherwise, there is an element of $D_j$ with no
intervals from the chain in it. Then, any other element of that partition may
intersect at most one interval. Thus, $E(\chi|D_j)$ would be exponentially
small with respect to $\kappa - j$ by bounded geometry (see Fact 5.1).

The second, more important and interesting situation is when the number
of intervals is large, and individual expectations of $\chi$ are much greater than
our estimate. Then we will use "averaging" techniques based on the fact
that the nonlinearity integral over the whole probability space is close to 0.
Unfortunately, these approaches are mutually exclusive: we cannot use
averaging if we only have a few intervals in the chain. Thus, we will have to
keep track of both possibilities throughout the proof of Proposition 2.

We will look at the quantity

$$v_j = \frac{\max\{E(\chi|D_j)(x) : x \notin U\}}{\min\{E(\chi|D_j)(x) : x \notin U\}} - 1 .$$

We will prove that if $j$ is more than $\lambda$ but sufficiently smaller than $\kappa$, this
quantity will show definite growth as $j$ is increased. On the other hand, we
will notice that there are certain bounds on its growth, and this will let us
assert that its initial value $v_\lambda$ must be very small. Proposition 2 will follow
in this way.

The conditional expectations $E(\chi|D_{\cdot})$ will be referred to shortly as
densities.

Technical preparations.

Approximate invariance of $\chi$.

Lemma 5.7 Let $J_1$ and $J_2$ be two similar^10 elements of $D_j$. We further
assume that neither of them is contained in $U$. To fix notations, we assume
that $J_1 = f^l(J_2)$ where $l$ may be negative, but $|l| < q_j$.
If

$$\kappa - K_1 |\log(E(\chi|D_j)(J_1))| > j ,$$

then

I E( X \Dj)(Ji) «■ t K exp(j-«) 

1 2 e(xw ■ 

Proof:

Since $\chi$ is the characteristic function of a chain, $\chi$ and $\chi \circ f^l$ may differ on
at most two intervals of the size of order $\kappa$. By bounded geometry, the length
of an interval of fineness $\kappa$ is exponentially small with the exponent $\kappa - j$
compared to the element of $D_j$ that contains it. So first we pick $K_1$ to ensure
that $J_1$ contains at least two intervals of the chain.

If this is true, the left-hand side of the second condition is roughly the
ratio of the measure of one interval from the chain to the total measure of
the portion of the chain in the element of the $j$-th partition containing it.
This means that

E( X \Dj)(Ji) eM3-K)-\Ji\ 
which immediately yields the claim of the lemma. 



□ 



Bounds on $v_j$.

Lemma 5.8 If $\kappa - K_2 |\log E(\chi)| > j > \lambda + K_1$ then $v_j < K_3$.

Proof:

There must be an interval $J \in D_j$ which is not adjacent to 0 with density
comparable to $\int \chi$. Then by Lemma 5.7 we see that it is possible to find $K_2$ so
that $\chi$ is sufficiently close to $\chi \circ f^l$, and since by choosing $K_1$ we can make
sure that the Jacobian $f'$ has bounded variation when we go to an interval
similar to $J$, we immediately obtain that the densities on intervals similar to
$J$ are all comparable to $\int \chi$.



10 That is, both lengthy or both short.

Let us assume that $J$ is a lengthy interval. What we still need to show is
that the density on short intervals is also at least comparable. It cannot be
much larger, since the images of short intervals will cover a definite portion
of $J$ in the next subdivision.

• The subdivision of $J$ contains a lengthy interval of $D_{j+1}$ with density at
least comparable to the density on $J$. Then, since this lengthy interval
is an image of any short interval from $D_j$, the estimate follows.

• All lengthy intervals subdividing $J$ have very small density compared
with the density on $J$. Then, the remaining short interval of $D_{j+1}$ must
attain very high density. But the images of this interval will fill up a
definite portion of the short intervals of $D_j$, and again the estimate follows.

The case when $J$ is a short interval can be solved using a similar, but
easier, reasoning.

□

Another technical lemma.

Lemma 5.9 Suppose that we have two functions $g$ and $h$ on the unit interval
$I$, both positive and measurable with respect to the same $\sigma$-algebra. Let us
also assume that $\int h = 1$ and $\inf\{h(x) : x \in [0, 1]\} = C_1 > 0$. Suppose
further that $\int g = 1$ and $\int g h = 1 + \epsilon$, $\epsilon > 0$. Then,

$$\frac{\max\{g(x) : x \in I\}}{\min\{g(x) : x \in I\}} > 1 + C_2(C_1) \epsilon$$

where $C_2$ is a continuous function of $C_1$ only.



Proof:

We introduce

$$g' := g - 1 - \epsilon/2 .$$

We also define $I_+ = \{ x \in I : g' > 0 \}$ and $I_- = I \setminus I_+$. We have

$$\int_I g' h = \epsilon/2 .$$

But

$$\int_{I_-} g' < -\epsilon/2$$

and thus

$$\int_{I_-} h g' < -C_1 \epsilon/2 .$$

Hence

$$\max\{g'(x) : x \in I\} \ge \int_{I_+} g' h > \frac{\epsilon}{2} (1 + C_1) .$$

□



The minimum and maximum density in similar intervals.

Lemma 5.10 Let us choose an integer $j$ which satisfies

$$\lambda + K_{1,L.5.8} < j < \kappa .$$

Suppose that the smallest density is attained on an element $Y$ of $D_j$ and the
largest density on $X \in D_j$.

Then, there are three possibilities:

1. 

exp(j - k) 

Ko ; : — > K-IV2 ■ 



2.

$$E(\chi) < K_6 K_7^{\kappa - j}$$

where $K_7 < 1$.



V j+[K 4 \log(v 3 )\] V j >K $ V j 

All constants mentioned are positive.^11

11 And the square brackets in the third condition mean the "ceil" function:

$$[x] := \inf\{n \in \mathbb{Z} : n \ge x\} .$$


Proof:

We first explain the role of $K_4$. By the bounded geometry and the Koebe
principle, the iterate of $f$ which maps $X$ to $Y$, denoted $\varphi$, has bounded
nonlinearity on $X$. By the bounded geometry again, the maximum length of
elements of $D_{j+l}$ on $X$ tends to 0 exponentially fast in $l$. Therefore, $K_4$ can
be chosen so that for $l > K_4 |\log(v_j)|$ the supremum of the quantity
$\varphi'(x) - \varphi'(y)$ on any element of $D_{j+l}$ is less than $K_8 v_j$, where $K_8$ can be
made arbitrarily small by choosing $K_4$ sufficiently large.

Next, we will analyze the meaning of the first alternative. When we
choose

$$K_2 = K_{2,L.5.7}$$

in accordance with Lemma 5.7, the left-hand side of the inequality which is
the first alternative bounds

E{ X \Dj){X) E( X Q<J>\D;(X) _ 
^Eixo^D^Xy E(x\D 3 )(X) } ■ 

By choosing $K_3$ appropriately, we can ensure that if the first alternative
does not occur, the above quantity is as small compared with $v_j$ as we may
wish.

Thus, we get

$$\int_Y \chi \cdot \varphi' > (1 - K_3 v_j) \int_X \chi . \qquad (17)$$

So we assume that the first possibility in Lemma 5.10 does not occur,
which technically means Equation 17. The appropriate value of $K_3$ will be
chosen later. We also pick $l = [K_4 |\log v_j|] + 1$.

Then we have to think what the second alternative means. According
to Lemma 5.8, either $v_j < K_{3,L.5.8}$ or $\int \chi$ is exponentially small in $\kappa - j$.
Therefore, we can pick constants $K_6$ and $K_7$ in such a way that if the second
alternative does not occur, then $v_j$ is bounded by $K_{3,L.5.8}$.

So, we now assume that neither the first nor the second alternative occurs,
which technically means that formula 17 holds with a small $K_3$ to be
arbitrarily chosen, and that $v_j$ is uniformly bounded.

12 The quickest way to see this is by the Uniform Bounded Distortion Lemma. But if
one works a little harder and proves that the Schwarzian has a definite sign for all but
very low iterates, the Koebe Principle will be enough.

By our assumption and Lemma 5.7,

$$\int_Y \chi \cdot \varphi' > (1 + v_j - K_3 v_j - K_3 v_j^2) \int_Y \chi . \qquad (18)$$

Next, we condition $\chi$ with respect to $D_{j+l}$:

$$\int_Y \chi \varphi' = \int_Y \chi (\varphi' - E(\varphi'|D_{j+l})) + \int_Y E(\chi|D_{j+l}) \cdot E(\varphi'|D_{j+l}) . \qquad (19)$$

The first term in Equation 19 is bounded by

$$\int_Y \chi (\varphi' - E(\varphi'|D_{j+l})) < K_8 v_j \int_Y E(\chi|D_{j+l}) . \qquad (20)$$

Here, the reader may recall that $K_8$ is an auxiliary constant which can
be made as small as we want by adjusting $K_4$.

Putting Equations 18, 19 and 20 together we get

$$\int_Y E(\chi|D_{j+l}) \cdot E(\varphi'|D_{j+l}) > (1 + v_j - K_8 v_j - K_3 (1 + v_j) v_j) \int_Y E(\chi|D_{j+l}) . \qquad (21)$$

We note here that $v_j$ is bounded by $K_{3,L.5.8}$, and so we can pick $K_3$ and
$K_4$, which controls $K_8$, in such a way that

K 5 = (1 - K 8 - K 3 (l + V] )) ■ K 3 ,l.0 - 1 > . 

Our final step is to conclude the third alternative from this inequality by
Lemma 5.9.



□ 



Maximum and minimum density in intervals of different kind. This
case will be solved by the following lemma:

Lemma 5.11 Let us suppose that the minimum density is attained on a
short interval $X$ and the maximum density on a lengthy interval $Y$, and $v_j = \epsilon$.
Then, we look at the next dynamical partition, in which $X$ becomes a lengthy
interval and $Y$ gets subdivided into a number of lengthy intervals $Y_i$ and one
short interval $X'$. At least one of the following holds true:

• For some $i$

$$E(\chi|Y_i) > K_1 \cdot E(\chi|Y)$$

with $K_1 < 1$, but, on the other hand, subject to the condition $K_1$

• $$v_{j+1} > 1 + K_2 \epsilon$$

where $K_2 > 1$ is a uniform constant.

Proof:

This is a rather obvious fact. We will not give a detailed proof. Instead,
we note informally that if the first possibility does not occur, it means that on
all intervals $Y_i$ the density is smaller by a definite amount than the density
on the whole $Y$. Then, the density on $X'$ has to exceed the density on $Y$.
Moreover, it must be larger by a definite amount, since the relative measure
of $X'$ with respect to $Y$ is not too big by the bounded geometry.

□



A remark. Of course, the claim of Lemma 5.11 also holds if the largest
density is attained on a short interval and the smallest on a lengthy one, and
the proof is the same.

Proof of Proposition 2. With all these technical facts we are ready to
conclude the proof of Proposition 2.



Let us discuss the meaning of Lemmas 5.10 and 5.11. Consider a $j'$ such that
$j < j' < (\kappa + j)/2$.

Suppose the maximum ratio of densities is attained on intervals of $D_{j'}$
which are of the same kind.

Then we want to use Lemma 5.10, and suppose first that the first alternative
in the Lemma occurs. If we multiply the corresponding inequality on both
sides by $E(\chi|X)$ and integrate over the whole probability space, we get what
Proposition 2 claims, and even more, since there is no radical in the exponent.
However, we get this claim for $j'$, not $j$. Now, as $j$ is no larger than $j'$, the
left-hand side in Proposition 2 will not increase if we replace $j'$ with $j$. On
the other hand, as we assumed that $j' < (j + \kappa)/2$, the exponential right-hand
side will suffer only a bounded decrease if we make such a replacement. Thus,
Proposition 2 is proven in that case.

We leave it to the reader to show that Proposition 2 also holds if the
second alternative occurs.

If the third alternative in Lemma 5.10 is the only one, then we want to
replace $j'$ with $j' + [K_{4,L.5.10} |\log v_{j'}|]$. According to the claim of the Lemma,
$v_{j'}$ is bound to increase by a definite factor.

Now, let us think about Lemma 5.11. It says that in the next dynamical
partition $v_{j'+1}$ could be substantially more, or it may be less, but then the
extrema are attained on intervals of the same kind. Moreover, in that second
case the constants have been set up so that the increase implied by the
subsequent use of Lemma 5.10 will offset the tiny decrease on the previous
step.

Thus, if we start with $j' = j$ and follow this reasoning, we see that either
Proposition 2 holds or $v_{j'}$ grows exponentially at a rate inversely proportional
to $|\log(v_{j'})|$ as we increase $j'$. This means that as $j'$ finally reaches
$(\kappa + j)/2$, the $\log(v_{j'})$ will have grown by roughly $\sqrt{(\kappa - j)/2}$. But, in view of
Lemma 5.8, $v_{j'}$ is either uniformly bounded, or Proposition 2 holds anyway,
since $E(\chi)$ is very small. Therefore, the initial value of $\log v_j$ must be of the
order of $\sqrt{\kappa - j}$.

This concludes the proof of Proposition 2.
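
The square root in the last step can be motivated by the following heuristic computation. It is only a sketch: we assume that the third alternative of Lemma 5.10 applies at every step, we write $L(j) = |\log v_j|$ for small $v_j$, we let $\kappa$ denote the fineness of the chain, and we treat $j$ as a continuous variable. The constant $c > 0$ and the endpoint $j_1$ are our own notation.

```latex
% One application of the third alternative advances j by about K_4 L(j)
% and multiplies v_j by a definite factor, i.e. lowers L by some c > 0:
\frac{\Delta L}{\Delta j} \approx -\frac{c}{K_4\, L}
\;\Longrightarrow\;
\frac{d}{dj}\, L^2 \approx -\frac{2c}{K_4}
\;\Longrightarrow\;
L(j)^2 - L(j_1)^2 \approx \frac{2c}{K_4}\,(j_1 - j) .
% With j_1 = (\kappa + j)/2 and L(j_1)^2 \ge 0 this gives
L(j) \gtrsim \sqrt{\frac{c}{K_4}\,(\kappa - j)} .
```

In words, each gain of a definite factor costs a number of steps proportional to the current value of $|\log v_j|$, so the square of the logarithm, rather than the logarithm itself, changes linearly in $j$.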

Main Theorem follows easily. We wish to bound

$$\int \chi \cdot n \qquad (22)$$

where $n$ is the nonlinearity coefficient.
We compute

$$\int \chi n = \int \chi (n - E(n|D_j)) + \int \chi E(n|D_j) . \qquad (23)$$

Here, $j$ is chosen so that it satisfies the assumptions of Proposition 2. The
first term in Equation 23 is easily estimated using an earlier lemma. The lemma
implies that $n - E(n|D_j)$ is exponentially small with $\kappa - \lambda$, and certainly the
same is true of the integral.

Our further effort is then aimed at estimating the second term.

$$\int \chi E(n|D_j) = \int E(\chi|D_j) E(n|D_j) \qquad (24)$$

$$= \int (E(\chi|D_j) - E(\chi)) E(n|D_j) + \int E(\chi) E(n|D_j) .$$

Again, Equation 24 reduces the problem to bounding two terms. The
first term is bounded based on Proposition 2, since the integral of

$$|E(\chi) - E(\chi|D_j)|$$

is small and the nonlinearity coefficient is bounded.

In the second term the constant can be moved outside the integral, and
the remaining integral is exponentially close to 0, as $U$ was assumed to be
symmetric and the domain of integration differs from the complement of $U$ at
most by the length of an interval from the chain.

This concludes the proof of the main estimate. Then, we can bound A
in the Cancellation Lemma, which immediately yields our Main Theorem.



5.4 Final remarks. 

The pure singularity property and unimodal maps. The pure singularity
property for critical circle maps has an analogue in the study of
unimodal maps with dynamics of solenoidal type. Since in such a situation
the measure of the corresponding Cantor attractor is 0 (see ||), it
follows immediately (for example by Lemma 2TT , see also [|1]) that the joint
distortion of the first return map due to parts of the interval where the
nonlinearity is bounded must tend to 0. So the counterpart of the pure
singularity property also holds for unimodal maps and is not a very difficult
fact in that context. Thus, the pure singularity property will allow us to
extend certain results proved for unimodal maps to circle maps. For example,
we can follow the line of argument by W. Paluba, which was originally
developed in the context of unimodal maps with solenoidal dynamics.
The result we get for circle maps is this:

Whenever two circle maps, each having a critical point of the same type,
are Lipschitz-conjugate, the conjugacy is differentiable at the critical point.

13 I learnt about the result by personal communication and it is to be part of his
Ph.D. thesis.

Asymptotic analyticity? For a large class of circle maps we can change
the coordinates in such a way that the map becomes a polynomial in a
neighborhood of the critical point. One is led to expect that asymptotically
the first return map tends to a composition of pieces of polynomial
maps. If this convergence could be proven to be uniform in a vicinity of
the circle in the complex plane, that might be an important step, possibly
allowing the use of some machinery developed in |TJ. Unfortunately, the
pure singularity property itself does not seem to allow that conclusion, as it
gives the estimate on the circle only. Hopefully, the conjecture will be proven
in a forthcoming paper.